Exploring Measures of "Readability" for Spoken Language: Analyzing linguistic features of subtitles to identify age-specific TV programs
Authors
Abstract
We investigate whether measures of readability can be used to identify age-specific TV programs. Based on a corpus of BBC TV subtitles, we employ a range of linguistic readability features motivated by Second Language Acquisition and Psycholinguistics research. Our hypothesis that such readability features can successfully distinguish between spoken language targeting different age groups is fully confirmed. The classifiers we trained on the basis of these readability features achieve a classification accuracy of 95.9%. Investigating several feature subsets, we show that the authentic material targeting specific age groups exhibits a broad range of linguistic and psycholinguistic characteristics that are indicative of the complexity of the language used.
Similar resources
The Representation of Non-Linguistic Sounds in Persian and English Subtitles for the Deaf and Hard-of-Hearing: A Comparative Study
Subtitling for the deaf and hard-of-hearing (SDH) is an area which deserves special attention as it enables these people to access the part of the ‘world’ intended for hearing people, including the world of ‘motion pictures’, and particularly movie sounds. Compared to linguistic sounds, non-linguistic sounds have received little attention in the field of translation, although they are in...
The Effect of Standard and Reversed Subtitling Versus No Subtitling Mode on L2 Vocabulary Learning
Audiovisual material accompanied by interlingual subtitles is a powerful pedagogical tool which can help improve the vocabulary learning of second-language learners. This study was intended to determine whether or not the mode (standard and reversed) of subtitling affects the incidental vocabulary acquisition of Iranian L2 learners while watching TV programs. Forty-five participants were random...
Exploring Aphasia in Kalhori
Objectives: Despite numerous studies conducted to explore the manifestations of aphasia in different languages of the world, the language-specific patterns of aphasic patients in Kalhori, a southern dialect of Kurdish spoken in part of Kermanshah Province, Iran, remain largely unexplored. The present study aims at investigating language deficits of a forty-year-old Kurdish-Persian aphasic woman, h...
An Analysis of Audiovisual Subtitling Translation Focusing on Wordplays from English into Persian in the Friends TV Series
Translation of humor and transferring its effect is one of the most challenging tasks of a translator due to the cultural clashes between the source language (SL) and the target language (TL). Accordingly, the present study aimed to specify the most frequently applied strategies in terms of Delabastita’s wordplay model used in SL and their translation strategy by Persian translators acc...
Named Entities in Indexing: A Case Study of TV Subtitles and Metadata Records
This paper explores the possible role of named entities in an automatic indexing process, based on text in subtitles. This is done by analyzing entity types, name density and name frequencies in subtitles and metadata records from different TV programs. The name density in metadata records is much higher than the name density in subtitles, and named entities with high frequencies in the subtitl...